# Multi-GPU Inference

DeepSeek-R1-Distill-Llama-70B-FP8-dynamic
License: MIT
An FP8-quantized version of DeepSeek-R1-Distill-Llama-70B that improves inference performance by reducing the bit width of its weights and activations.
Large Language Model · Transformers
Publisher: RedHatAI
45.77k · 9
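
Because the FP8 weights still describe a 70B-parameter model, it is typically served across several GPUs. Below is a minimal sketch of loading it with vLLM tensor parallelism; the Hugging Face repo id, GPU count, and context length are assumptions, not details from this listing.

```python
# Minimal sketch: multi-GPU inference of an FP8-quantized 70B model with vLLM.
# The repo id and tensor_parallel_size are assumptions; adjust to your setup.
from vllm import LLM, SamplingParams

llm = LLM(
    model="RedHatAI/DeepSeek-R1-Distill-Llama-70B-FP8-dynamic",  # assumed HF repo id
    tensor_parallel_size=4,   # shard the model across 4 GPUs
    max_model_len=8192,       # cap context length to fit the memory budget
)

params = SamplingParams(temperature=0.6, max_tokens=256)
outputs = llm.generate(
    ["Explain FP8 dynamic quantization in one paragraph."], params
)
print(outputs[0].outputs[0].text)
```

Tensor parallelism splits each weight matrix across the GPUs, so the per-device memory footprint shrinks roughly in proportion to the GPU count, which is what makes a 70B model servable on commodity multi-GPU nodes.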